Voice activity detector based on enhanced cumulant of LPC residual and on-line EM algorithm

Authors

  • David Cournapeau
  • Tatsuya Kawahara
  • Kenji Mase
  • Tomoji Toriyama
Abstract

This paper addresses the problem of segmenting audio data recorded with embedded devices for intelligent sensing in multi-modal interactions. We propose a real-time method for robust speech detection in natural, noisy environments. It is based on a fusion of higher-order statistics of the LPC residual and autocorrelation, and adopts an on-line version of the Expectation-Maximization algorithm for classification. Experimental evaluations show that the proposed method provides better detection performance under different types of natural noise and works robustly against other voices in multi-speaker interactive situations. Because the proposed method relies on features with a low computational cost and has low latency, it is suitable for real-time tracking applications.
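The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it computes only the plain excess kurtosis of the LPC residual (not the paper's enhanced cumulant or its fusion with autocorrelation), and it uses a generic stepwise on-line EM for a two-component 1-D Gaussian mixture. The names `lpc_residual`, `excess_kurtosis`, and `OnlineGMM2`, the LPC order of 10, and the step size of 0.05 are all assumptions chosen for illustration.

```python
import numpy as np


def lpc_residual(frame, order=10):
    """Prediction error of an LPC fit (autocorrelation method) on one frame."""
    x = np.asarray(frame, dtype=float)
    x = x - np.mean(x)
    n = len(x)
    # Autocorrelation for lags 0..order.
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * (r[0] + 1.0) * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # e[n] = x[n] - sum_k a[k] * x[n - k]
    pred = np.zeros(n)
    for k in range(1, order + 1):
        pred[k:] += a[k - 1] * x[:n - k]
    return x - pred


def excess_kurtosis(e):
    """Normalized fourth-order cumulant; close to 0 for Gaussian data."""
    e = np.asarray(e, dtype=float)
    e = e - np.mean(e)
    m2 = np.mean(e ** 2)
    if m2 == 0.0:
        return 0.0
    return float(np.mean(e ** 4) / m2 ** 2 - 3.0)


class OnlineGMM2:
    """Two-component 1-D Gaussian mixture updated one sample at a time.

    Assumed variant: running averages of the sufficient statistics with a
    constant step size, followed by a closed-form M-step.
    """

    def __init__(self, means=(0.0, 5.0), variances=(1.0, 1.0), step=0.05):
        self.w = np.array([0.5, 0.5])
        self.mu = np.array(means, dtype=float)
        self.var = np.array(variances, dtype=float)
        self.step = step
        # Running estimates of E[z], E[z * x], E[z * x^2] per component.
        self.s0 = self.w.copy()
        self.s1 = self.w * self.mu
        self.s2 = self.w * (self.var + self.mu ** 2)

    def update(self, x):
        """Consume one feature value; return the index of the likelier component."""
        # E-step: posterior responsibility of each component for x.
        lik = self.w * np.exp(-0.5 * (x - self.mu) ** 2 / self.var)
        lik = lik / np.sqrt(2.0 * np.pi * self.var)
        post = lik / max(lik.sum(), 1e-300)
        # Stochastic-approximation update of the sufficient statistics.
        g = self.step
        self.s0 += g * (post - self.s0)
        self.s1 += g * (post * x - self.s1)
        self.s2 += g * (post * x * x - self.s2)
        # M-step from the running statistics.
        self.w = self.s0 / self.s0.sum()
        self.mu = self.s1 / self.s0
        self.var = np.maximum(self.s2 / self.s0 - self.mu ** 2, 1e-6)
        return int(np.argmax(post))
```

In such a setup, `excess_kurtosis(lpc_residual(frame))` would be computed per frame and passed to `OnlineGMM2.update`; reading the component with the larger mean as speech is a further assumption, motivated by the peaky residual of voiced speech compared with near-Gaussian noise.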

Similar resources

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

The fourth-order cumulant of speech signals with application to voice activity detection

This paper explores the fourth-order cumulants (FOC) of the LPC residual of speech signals and presents a new algorithm for voice activity detection (VAD) based on the newly established FOC properties. Analytical expressions for the horizontal slice of the fourth-order cumulant as well as the kurtosis of voiced speech are derived based on a reported sinusoidal model [4]. The derivations demonstrate that...

Robust voice activity detection using higher-order statistics in the LPC residual domain

This paper presents a robust algorithm for voice activity detection (VAD) based on newly established properties of the higher order statistics (HOS) of speech. Analytical expressions for the third and fourth-order cumulants of the LPC residual of short-term speech are derived assuming a sinusoidal model. The flat spectral feature of this residual results in distinct characteristics for these cu...

Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction

The purpose of a voice conversion (VC) system is to change the perceived speaker identity of a speech signal. In this paper, we propose a new algorithm based on converting the LPC spectrum and predicting the residual as a function of the target envelope parameters. We conduct listening tests based on speaker discrimination of same/difference pairs to measure the accuracy by which the converted ...

An Improved Method of Speech Compression Using Warped LPC and MLT-SPIHT Algorithm

Frequency-warped signal processing techniques are attractive to many wideband speech and audio applications since they have a clear connection to the frequency resolution of human hearing. A warped version of linear predictive coding (LPC) for speech compression is implemented in this paper and an analysis of the application of the Set Partitioning In Hierarchical Trees (SPIHT) algorithm to th...

Publication date: 2006